identification model
Wav2Arrest 2.0: Long-Horizon Cardiac Arrest Prediction with Time-to-Event Modeling, Identity-Invariance, and Pseudo-Lab Alignment
Kataria, Saurabh, Fattahi, Davood, Wang, Minxiao, Xiao, Ran, Clark, Matthew, Ruchti, Timothy, Mai, Mark, Hu, Xiao
High-frequency physiological waveforms offer deep, real-time insight into patient status. Recently, physiological foundation models based on photoplethysmography (PPG), such as PPG-GPT, have been shown to predict critical events, including cardiac arrest (CA). However, their powerful representations still need to be leveraged suitably, especially when downstream data and labels are scarce. We offer three orthogonal improvements to PPG-only CA systems that use minimal auxiliary information. First, we propose to use time-to-event modeling, either through simple regression to the event onset time or by pursuing fine-grained discrete survival modeling. Second, we encourage the model to learn CA-focused features by making them patient-identity invariant. This is achieved by first training the largest-scale de-identified biometric identification model, referred to as the p-vector, and subsequently using it adversarially to deconfound cues, such as person identity, that may cause overfitting through memorization. Third, we propose regression on the pseudo-lab values generated by pre-trained auxiliary estimator networks. This is crucial since true blood lab measurements, such as lactate, sodium, troponin, and potassium, are collected sparingly. Via zero-shot prediction, the auxiliary networks can enrich cardiac arrest waveform labels and generate pseudo-continuous estimates as targets. Each proposal can independently improve the 24-hour time-averaged AUC from 0.74 to the 0.78-0.80 range. We primarily improve over longer time horizons with minimal degradation near the event, thus advancing Early Warning System research. Finally, we pursue a multi-task formulation and diagnose a high gradient conflict rate among its competing losses, which we alleviate via the PCGrad optimization technique.
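The gradient-conflict diagnosis and the PCGrad remedy mentioned in this abstract can be illustrated with a small sketch. This is not the authors' implementation: it is a plain-Python rendering of the generally known PCGrad projection rule, and the function names (`pcgrad`, `conflict_rate`) and the toy gradients are assumptions made for illustration.

```python
import random


def dot(u, v):
    """Inner product of two equal-length gradient vectors."""
    return sum(a * b for a, b in zip(u, v))


def conflict_rate(grads):
    """Fraction of task pairs whose gradients have a negative inner product."""
    pairs = [(i, j) for i in range(len(grads)) for j in range(i + 1, len(grads))]
    if not pairs:
        return 0.0
    return sum(dot(grads[i], grads[j]) < 0 for i, j in pairs) / len(pairs)


def pcgrad(grads, rng=None):
    """Combine per-task gradients with PCGrad-style projection.

    For each task gradient g_i, visit the other tasks in random order and,
    whenever g_i conflicts with g_j (negative inner product), remove the
    component of g_i that points along g_j. The projected gradients are summed.
    """
    rng = rng or random.Random(0)  # fixed seed only to keep the sketch reproducible
    projected = []
    for i, g in enumerate(grads):
        g_i = list(g)
        others = [j for j in range(len(grads)) if j != i]
        rng.shuffle(others)
        for j in others:
            g_j = grads[j]  # project against the original, unprojected gradient
            overlap = dot(g_i, g_j)
            norm_sq = dot(g_j, g_j)
            if overlap < 0 and norm_sq > 0:
                scale = overlap / norm_sq
                g_i = [a - scale * b for a, b in zip(g_i, g_j)]
        projected.append(g_i)
    return [sum(column) for column in zip(*projected)]


# Toy example: two task losses whose gradients point in conflicting directions.
task_grads = [[1.0, 0.0], [-1.0, 1.0]]
combined = pcgrad(task_grads)
```

In a real multi-task setup the per-task gradients would come from backpropagating each loss separately through the shared parameters; here they are hard-coded vectors.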
DBR: Divergence-Based Regularization for Debiasing Natural Language Understanding Models
Li, Zihao, Tang, Ruixiang, Cheng, Lu, Wang, Shuaiqiang, Yin, Dawei, Du, Mengnan
Pre-trained language models (PLMs) have achieved impressive results on various natural language processing tasks. However, recent research has revealed that these models often rely on superficial features and shortcuts instead of developing a genuine understanding of language, especially for natural language understanding (NLU) tasks. Consequently, the models struggle to generalize to out-of-domain data. In this work, we propose Divergence-Based Regularization (DBR) to mitigate this shortcut learning behavior. Our method measures the divergence between the output distributions for original examples and for examples in which shortcut tokens have been masked. This process prevents the model's predictions from being overly influenced by shortcut features or biases. We evaluate our model on three NLU tasks and find that it improves out-of-domain performance with little loss of in-domain accuracy. Our results demonstrate that reducing the reliance on shortcuts and superficial features can enhance the generalization ability of large pre-trained language models.
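The regularizer described in this abstract, a divergence between the model's output distribution on an original example and on its shortcut-masked counterpart, can be sketched as follows. The abstract does not say which divergence is used or how it is weighted, so the Jensen-Shannon divergence, the `weight` parameter, and all function names here are assumptions chosen for illustration.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))


def js_divergence(p, q):
    """Jensen-Shannon divergence (assumed choice; the source does not specify)."""
    mid = [0.5 * (pi + qi) for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, mid) + 0.5 * kl_divergence(q, mid)


def regularized_loss(logits_orig, logits_masked, label, weight=1.0):
    """Cross-entropy on the original input plus a divergence penalty.

    logits_masked are the model outputs for the same example with its
    shortcut tokens masked; `weight` is a hypothetical trade-off coefficient.
    """
    p = softmax(logits_orig)
    q = softmax(logits_masked)
    cross_entropy = -math.log(p[label] + 1e-12)
    return cross_entropy + weight * js_divergence(p, q)


# A prediction that changes sharply once shortcut tokens are masked is penalized more.
stable = regularized_loss([2.0, 0.5], [2.0, 0.5], label=0)
shortcut_reliant = regularized_loss([2.0, 0.5], [0.2, 1.8], label=0)
```

When the two distributions agree the penalty vanishes, so the loss reduces to ordinary cross-entropy; the penalty grows as masking the shortcut tokens shifts the prediction.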
Speech Rhythm-Based Speaker Embeddings Extraction from Phonemes and Phoneme Duration for Multi-Speaker Speech Synthesis
Fujita, Kenichi, Ando, Atsushi, Ijima, Yusuke
This paper proposes a speech rhythm-based method for speaker embeddings to model phoneme duration using a few utterances by the target speaker. Speech rhythm is one of the essential factors among speaker characteristics, along with acoustic features such as F0, for reproducing individual utterances in speech synthesis. A novel feature of the proposed method is the rhythm-based embeddings extracted from phonemes and their durations, which are known to be related to speaking rhythm. They are extracted with a speaker identification model similar to the conventional spectral feature-based one. To evaluate performance, we conducted three experiments: speaker embedding generation, speech synthesis with the generated embeddings, and embedding space analysis. The proposed method demonstrated moderate speaker identification performance (15.2% EER), even with only phonemes and their duration information. The objective and subjective evaluation results demonstrated that the proposed method can synthesize speech with speech rhythm closer to the target speaker than the conventional method. We also visualized the embeddings to evaluate the relationship between the distance of the embeddings and the perceptual similarity. The visualization of the embedding space and the analysis of this relationship indicated that the distribution of embeddings reflects the subjective and objective similarity.
IoTGAN: GAN Powered Camouflage Against Machine Learning Based IoT Device Identification
Hou, Tao, Wang, Tao, Lu, Zhuo, Liu, Yao, Sagduyu, Yalin
With the proliferation of IoT devices, researchers have developed a variety of IoT device identification methods with the assistance of machine learning. Nevertheless, the security of these identification methods mostly depends on collected training data. In this research, we propose a novel attack strategy named IoTGAN to manipulate an IoT device's traffic such that it can evade machine learning based IoT device identification. In the development of IoTGAN, we face two major technical challenges: (i) how to obtain the discriminative model in a black-box setting, and (ii) how to add perturbations to IoT traffic through the manipulative model so as to evade identification without affecting the functionality of IoT devices. To address these challenges, a neural network based substitute model is used to fit the target model in black-box settings; it works as the discriminative model in IoTGAN. A manipulative model is trained to add adversarial perturbations to the IoT device's traffic to evade the substitute model. Experimental results show that IoTGAN can successfully achieve the attack goals. We also develop efficient countermeasures to protect machine learning based IoT device identification from being undermined by IoTGAN.
Nuisances via Negativa: Adjusting for Spurious Correlations via Data Augmentation
Puli, Aahlad, Joshi, Nitish, He, He, Ranganath, Rajesh
In prediction tasks, there exist features that are related to the label in the same way across different settings for that task; these are semantic features or semantics. Features with varying relationships to the label are nuisances. For example, in detecting cows from natural images, the shape of the head is a semantic, but because images of cows often, though not always, have grass backgrounds, the background is a nuisance. Relationships between a nuisance and the label are unstable across settings and, consequently, models that exploit nuisance-label relationships face performance degradation when these relationships change. Direct knowledge of a nuisance helps build models that are robust to such changes, but requires extra annotations beyond labels and covariates. In this paper, we develop an alternative way to produce robust models by data augmentation. These data augmentations corrupt semantic information to produce models that identify and adjust for where nuisances drive predictions. We study semantic corruptions in powering different spurious-correlation avoiding methods on multiple out-of-distribution (OOD) tasks like classifying waterbirds, natural language inference (NLI), and detecting cardiomegaly in chest X-rays.
Take One Gram of Neural Features, Get Enhanced Group Robustness
Roburin, Simon, Corbière, Charles, Puy, Gilles, Thome, Nicolas, Aubry, Matthieu, Marlet, Renaud, Pérez, Patrick
Predictive performance of machine learning models trained with empirical risk minimization (ERM) can degrade considerably under distribution shifts. The presence of spurious correlations in training datasets leads ERM-trained models to display high loss when evaluated on minority groups not presenting such correlations. Extensive attempts have been made to develop methods improving worst-group robustness. However, they require group information for each training input or, at least, a validation set with group labels to tune their hyperparameters, which may be expensive to get or unknown a priori. In this paper, we address the challenge of improving group robustness without group annotation during training or validation. To this end, we propose to partition the training dataset into groups based on Gram matrices of features extracted by an "identification" model and to apply robust optimization based on these pseudo-groups. In the realistic context where no group labels are available, our experiments show that our approach not only improves group robustness over ERM but also outperforms all recent baselines.
Improving group robustness under noisy labels using predictive uncertainty
Oh, Dongpin, Lee, Dae, Byun, Jeunghyun, Shin, Bonggun
Standard empirical risk minimization (ERM) can underperform on certain minority groups (e.g., waterbirds on land or landbirds on water) due to the spurious correlation between the input and its label. Several studies have improved the worst-group accuracy by focusing on the high-loss samples. The hypothesis behind this is that such high-loss samples are spurious-cue-free (SCF) samples. However, these approaches can be problematic since the high-loss samples may also be samples with noisy labels in real-world scenarios. To resolve this issue, we utilize the predictive uncertainty of a model to improve the worst-group accuracy under noisy labels. To motivate this, we theoretically show that the high-uncertainty samples are the SCF samples in the binary classification problem. This theoretical result implies that the predictive uncertainty is an adequate indicator to identify SCF samples in a noisy label setting. Motivated by this, we propose a novel ENtropy based Debiasing (END) framework that prevents models from learning the spurious cues while being robust to the noisy labels. In the END framework, we first train the identification model to obtain the SCF samples from a training set using its predictive uncertainty. Then, another model is trained on the dataset augmented with an oversampled SCF set. The experimental results show that our END framework outperforms other strong baselines on several real-world benchmarks that consider both noisy labels and spurious cues.
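The two-stage procedure described in this abstract (score training samples by predictive uncertainty, then oversample the selected set) can be sketched as below. The abstract gives no selection rule or oversampling ratio, so the entropy `threshold`, the `factor` parameter, and the function names are hypothetical, and the probabilities are toy values standing in for the identification model's outputs.

```python
import math


def predictive_entropy(probs, eps=1e-12):
    """Shannon entropy (in nats) of one predicted class distribution."""
    return -sum(p * math.log(p + eps) for p in probs)


def select_scf_indices(prob_rows, threshold):
    """Indices of samples whose predictive entropy exceeds the threshold.

    These high-uncertainty samples are treated as the candidate
    spurious-cue-free (SCF) set. The threshold is a hypothetical knob.
    """
    return [
        i for i, probs in enumerate(prob_rows)
        if predictive_entropy(probs) > threshold
    ]


def oversample(dataset, scf_indices, factor=2):
    """Return the dataset with each selected sample appearing `factor` times."""
    extra = [dataset[i] for i in scf_indices] * (factor - 1)
    return list(dataset) + extra


# Toy predicted probabilities for three training samples (binary task).
predicted = [[0.99, 0.01], [0.5, 0.5], [0.9, 0.1]]
samples = ["a", "b", "c"]

scf = select_scf_indices(predicted, threshold=0.5)
augmented = oversample(samples, scf, factor=3)
```

A second model would then be trained on `augmented`; that training step is omitted here because the abstract does not describe it in detail.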
Parameter Identification of a PN-Guided Incoming Missile Using an Improved Multiple-Model Mechanism
Wang, Yinhan, Wang, Jiang, Fan, Shipeng
An active defense against an incoming missile requires information about it, including a guidance law parameter and a first-order lateral time constant. To this end, assuming that a missile with a proportional navigation (PN) guidance law attempts to attack an aerial target with bang-bang evasive maneuvers, a parameter identification model based on the gated recurrent unit (GRU) neural network is built in this paper. The analytic identification solutions for the guidance law parameter and the first-order lateral time constant are derived. The inputs of the identification model are the available kinematic information between the aircraft and the missile, while the outputs contain the regression results of the missile parameters. To increase the training speed and the identification accuracy of the model, an output processing method called the improved multiple-model mechanism (IMMM) is proposed in this paper. The effectiveness of IMMM and the performance of the established model are demonstrated through numerical simulations under various engagement scenarios.
Adversarial Deep Learning in EEG Biometrics
Ozdenizci, Ozan, Wang, Ye, Koike-Akino, Toshiaki, Erdogmus, Deniz
Deep learning methods for person identification based on electroencephalographic (EEG) brain activity encounter the problem of exploiting the temporally correlated structures or recording session specific variability within EEG. Furthermore, recent methods have mostly been trained and evaluated on single-session EEG data. We address this problem from an invariant representation learning perspective. We propose an adversarial inference approach to extend such deep learning models to learn session-invariant person-discriminative representations that can provide robustness in terms of longitudinal usability. Using adversarial learning within a deep convolutional network, we empirically assess and show improvements with our approach based on longitudinally collected EEG data for person identification from half-second EEG epochs.
Identification of Unmodeled Objects from Symbolic Descriptions
Baisero, Andrea, Otte, Stefan, Englert, Peter, Toussaint, Marc
Successful human-robot cooperation hinges on each agent's ability to process and exchange information about the shared environment and the task at hand. Human communication is primarily based on symbolic abstractions of object properties, rather than precise quantitative measures. A comprehensive robotic framework thus requires an integrated communication module which is able to establish a link and convert between perceptual and abstract information. The ability to interpret composite symbolic descriptions enables an autonomous agent to a) operate in unstructured and cluttered environments, in tasks which involve unmodeled or never-before-seen objects; and b) exploit the aggregation of multiple symbolic properties as an instance of ensemble learning, to improve identification performance even when the individual predicates encode generic information or are imprecisely grounded. We propose a discriminative probabilistic model which interprets symbolic descriptions to identify the referent object contextually w.r.t. the structure of the environment and other objects. The model is trained using a collected dataset of identifications, and its performance is evaluated by quantitative measures and a live demo developed on the PR2 robot platform, which integrates elements of perception, object extraction, object identification and grasping.